Critical Care

Springer Science and Business Media LLC

Preprints posted in the last 7 days, ranked by how well they match Critical Care's content profile, based on 14 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.

1
An AI-assisted feasibility evaluation of three photoplethysmography-derived microvascular reactivity signals in MIMIC-IV-WDB v0.1.0

Landry, T. C.; Kim, Y.

2026-06-06 health informatics 10.64898/2026.06.03.26354863 medRxiv
Top 0.1%
3.6%

Background. Capillary refill time, an examiner-dependent bedside test of distal microvascular perfusion, has become a resuscitation target in septic shock [1-4], motivating a continuous surrogate computed from the photoplethysmogram (PPG, the optical waveform the pulse oximeter on every ICU patient already records) [5-8]. Objective. We attempted three PPG-derived candidate measures on the MIMIC-IV Waveform Database (MIMIC-IV-WDB v0.1.0) and asked, by inspecting randomly drawn examples, whether each captured its intended physiology before any downstream modeling. Methods. MIMIC-IV-WDB v0.1.0 [9] was linked to MIMIC-IV [10]. The signals were a cuff-anchored perfusion-index recovery (reactive hyperemia when the cuff shares an arm with the probe), a slow Mayer-wave-band power ratio of the perfusion index (sympathetic vasomotor tone), and a per-beat diastolic exponential decay time constant (a refill-like recovery time). For each signal we drew 10 random examples at a fixed seed and checked them against a checklist fixed in advance. Each was read by the author and, separately, by MedGemma 1.5, a multimodal medical language model run locally. A synthetic test with a known time constant checked the third signal. Results. The cuff-anchored signal showed the expected occlusion-reperfusion shape on 268 of 6,236 evaluable cuff cycles (4.30%) in 15 of 19 patients, consistent with opposite-limb placement of the probe and cuff. The slow-band ratio returned a stable cohort value, but a clear, stationary peak appeared in only 4 of 10 random windows. The per-beat fit met its goodness-of-fit threshold in 10 of 10 beats, yet a cardiac-frequency heuristic flagged a possible fit on the heart-rate oscillation in 7 of 10, and in 5 of 17 patients the time constant lay where an exponential is indistinguishable from a straight line. A 0.5 Hz high-pass pre-filter implanted its own approximately 318 ms time constant regardless of truth.
The language model tracked the human on clear positives but reported the pattern present on every call it returned, never absent. Conclusions. Two of the three candidate signals did not reflect their intended physiology in most examples, and the third was constrained by sensor placement. Inspecting a few random raw inputs against a checklist written in advance is an inexpensive upstream check before downstream inference on PPG-derived microvascular signals.
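The approximately 318 ms figure quoted in the abstract is what a first-order high-pass filter with a 0.5 Hz corner would produce, since such a filter has time constant τ = 1/(2π·f_c). The sketch below only reproduces that arithmetic; the filter order is an assumption (the abstract does not state it), and the function name is hypothetical.

```python
import math


def highpass_time_constant_s(cutoff_hz: float) -> float:
    """Time constant in seconds of a first-order high-pass filter.

    Assumes a single-pole (RC-style) response, for which
    tau = 1 / (2 * pi * f_c). The preprint abstract does not state the
    filter order, so this is an illustrative assumption.
    """
    if cutoff_hz <= 0:
        raise ValueError("cutoff_hz must be positive")
    return 1.0 / (2.0 * math.pi * cutoff_hz)


if __name__ == "__main__":
    tau_ms = highpass_time_constant_s(0.5) * 1000.0
    print(f"tau at 0.5 Hz: {tau_ms:.1f} ms")  # prints "tau at 0.5 Hz: 318.3 ms"
```

Under that assumption, any exponential decay fitted after the pre-filter would tend toward roughly 318 ms whatever the underlying physiology, which is consistent with the artifact the authors describe.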

2
A Clinical Predictor of Lung Molecular Endotype Identifies Heterogeneity in Corticosteroid Response in Severe COVID-19: an Emulated Target Trial

Sines, B.; Hagan, R.; Jiang, X.; Pavlechko, E.; McClain, S.; Hunt, X.; Florou-Moreno, J.; Acquadro, J.; Risa, G.; Valsaraj, V.; Schisler, J.; Wolfgang, M. C.

2026-06-10 intensive care and critical care medicine 10.64898/2026.06.08.26355201 medRxiv
Top 0.1%
3.5%

Background: Corticosteroids reduce mortality in severe COVID-19 requiring oxygen or invasive mechanical ventilation, yet emerging data suggest that SARS-CoV-2-associated acute lung injury is biologically heterogeneous and that treatment response may vary across molecularly defined disease states. Lung-derived molecular endotypes of severe COVID-19-associated acute lung injury have been described, but direct molecular profiling is not routinely available at the bedside. We evaluated whether a clinical predictor of previously defined lung molecular endotype identifies heterogeneity in corticosteroid treatment effect among mechanically ventilated patients with COVID-19. Methods: We used a single-center cohort of 5,000 patients with COVID-19 treated at the University of North Carolina Hospital between January 1, 2020, and December 31, 2022, to emulate a target trial assessing the effect of corticosteroid receipt on mortality, length of stay, and incident organ support. Confounding was addressed through inverse probability of treatment weighting (IPTW). Outcomes for severely ill patients requiring mechanical ventilation were compared to the RECOVERY trial results, with subsequent moderation analysis and stratified analysis by clinically predicted lung molecular endotype and vaccination status. The primary outcome was 28-day mortality. Secondary outcomes were time to discharge alive and progression to additional organ support. Results: This emulated target trial showed a directionally favorable but statistically non-significant association between corticosteroid treatment and reduced 28-day mortality in patients requiring mechanical ventilation for SARS-CoV-2 infection. A clinical predictor of lung molecular endotype moderated the effect of corticosteroids on 28-day mortality (p-value for interaction 0.038) and identified distinct predicted endotype-specific treatment effects.
Corticosteroid treatment was associated with lower 28-day mortality in the predicted Hyper-Inflammatory endotype (OR 0.62, 95% CI 0.39, 0.99) but not in the predicted Metabolic Dysregulation endotype (OR 1.15, 95% CI 0.82, 1.61). We did not detect significant effect modification by vaccination status (p-value for interaction 0.65), although inference was limited by the small vaccinated subgroup (28-day mortality OR 0.78, 95% CI 0.37, 1.65 in vaccinated vs 0.94, 95% CI 0.70, 1.26 in unvaccinated). Conclusions: In this target trial emulation of mechanically ventilated patients with severe COVID-19, corticosteroid treatment showed a directionally favorable but statistically non-significant association with reduced 28-day mortality in the overall cohort. However, a clinical predictor of lung molecular endotype identified significant heterogeneity in treatment effect, with benefit concentrated in the predicted Hyper-Inflammatory endotype and no apparent benefit in the predicted Metabolic Dysregulation endotype. These findings support prospective validation of clinically deployable endotype-guided corticosteroid treatment strategies in acute lung injury and ARDS.
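Inverse probability of treatment weighting, the confounding adjustment named in the Methods, can be sketched in a few lines. This is a minimal illustration assuming propensity scores have already been estimated; the function names and the toy values are hypothetical and are not taken from the study, which does not describe its weighting variant (for example, stabilized or truncated weights).

```python
from typing import Sequence


def iptw_weights(treated: Sequence[int], propensity: Sequence[float]) -> list[float]:
    """Unstabilized IPTW weights: 1/e for treated, 1/(1-e) for untreated."""
    weights = []
    for t, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / e if t else 1.0 / (1.0 - e))
    return weights


def weighted_risk(outcome: Sequence[int], weights: Sequence[float],
                  in_group: Sequence[bool]) -> float:
    """Weighted proportion of events among the rows flagged in in_group."""
    num = sum(w * y for y, w, g in zip(outcome, weights, in_group) if g)
    den = sum(w for w, g in zip(weights, in_group) if g)
    return num / den


if __name__ == "__main__":
    # Hypothetical toy rows, for illustration only (not study data).
    treated = [1, 1, 0, 0]
    propensity = [0.8, 0.5, 0.5, 0.2]
    died_28d = [0, 1, 1, 0]

    w = iptw_weights(treated, propensity)
    risk_treated = weighted_risk(died_28d, w, [t == 1 for t in treated])
    risk_control = weighted_risk(died_28d, w, [t == 0 for t in treated])
    print(w, risk_treated, risk_control)
```

Patients who received a treatment they were unlikely to receive get larger weights, so the weighted arms resemble each other on the measured confounders.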

3
Reprogramming of Iron and Oxygen Metabolism Across the Spectrum of Primary Aldosteronism

Parisien-La Salle, S.; Tsai, C. H.; Newman, A. J.; Heydarpour, M.; Mahrokhian, S.; Hanna, I.; Brown, J. M.; Waikar, S.; Moussa, M.; Vaidya, A.

2026-06-10 endocrinology 10.64898/2026.06.09.26355256 medRxiv
Top 0.1%
2.8%

Background: Pathologic aldosteronism induces oxidative stress, tissue injury, and increases in hemoglobin. Conversely, aldosterone antagonist therapy decreases hemoglobin. Whether these effects are attributable to aldosterone-mediated changes in iron and oxygen metabolism is unknown. Methods: The plasma proteome of participants with overt primary aldosteronism (PA) (n=50) was compared with participants without overt PA (n=61). To isolate aldosterone-dependent effects, participants without overt PA underwent oral sodium suppression testing to quantify the magnitude of renin-independent aldosterone production, enabling monotonic dose-response analyses across the continuum of renin-independent aldosteronism (subclinical to overt PA). Differential abundance testing was performed using empirical Bayes linear modeling, followed by Reactome pathway enrichment analysis and covariate-adjusted sensitivity analyses. To validate clinical relevance, aldosterone dose-response trends with blood count parameters were examined in this cohort, and an independent population-based cohort of 5,713 people with hypertension. Results: 903 proteins in the peripheral circulation were differentially abundant in overt PA versus participants without PA. The most significantly increased protein in overt PA was CYBRD1, involved in iron reduction and absorption. Pathway enrichment identified 16 iron- and heme-related pathways, including erythropoietin signaling, heme biosynthesis and mitochondrial iron-sulfur cluster biogenesis, with increases in heme and erythroid proteins and decreases in mitochondrial iron-sulfur proteins. Linear aldosterone dose-dependent trend analyses across the PA continuum further supported this signature, identifying progressive increases in hemoglobin subunits (HBA1/HBB), heme-related proteins (HMBS, UROS, AMBP, HPX, GLO1) and erythrocyte oxygen handling enzymes (CA1/CA3), alongside progressive reductions in mitochondrial electron transport chain subunits (CYCS, ETFA). 
These proteomic changes corresponded with aldosterone dose-dependent increases in red blood cell count, hemoglobin, and hematocrit, in this cohort and another population-based cohort. Conclusion: The continuum of PA is characterized by a progressive shift away from mitochondrial oxidative phosphorylation and toward increased intestinal iron absorption, preferential iron transport over storage, and enhanced heme synthesis and recycling, possibly reflecting cellular pseudohypoxia and systemic adaptations to increase oxygen delivery. These findings provide a novel mechanistic basis for aldosterone-mediated tissue injury and the benefits of aldosterone-directed therapy.

4
Liver biopsy confirms precise and efficient correction of SERPINA1 after in vivo Base Editing in a Patient with Alpha-1 Antitrypsin Deficiency

Krooss, S. A.; Yang, T.; Yuan, Q.; Drick, N.; Sgodda, M.; Held, J.; Behrendt, P.; Hartleben, B.; Koczulla, R.; Ma, X.; Liu, Y.; Wedemeyer, H.; Janciauskiene, S.; Di Donato, N.; Cantz, T.; Wang, E.; Wu, Y.; Hoeper, M.; Xia, Q.; Ott, M.

2026-06-09 genetic and genomic medicine 10.64898/2026.06.01.26354551 medRxiv
Top 0.2%
2.5%

Background: Alpha-1 antitrypsin deficiency (AATD) caused by the PI*ZZ mutation (Glu342Lys) results in hepatic accumulation of misfolded AAT-Z protein and reduced circulating AAT levels, leading to progressive liver disease and emphysema. Gene correction therapy represents a potentially curative approach by directly correcting the underlying genetic defect. We report the first case of successful hepatic gene correction with early histological and functional assessment. Methods/Case presentation: We report the case of a 66-year-old male patient with PI*ZZ AATD who underwent gene correction therapy within the YOLT-202 phase I/Ia clinical trial (ClinicalTrials.gov ID NCT07193615). Ten weeks post-treatment, a liver biopsy was performed to re-evaluate pre-existing F2 liver fibrosis as measured by elastography before entering the study. Serum samples allowed functional assessment of the AAT-mediated elastase inhibition. Results: Liver biopsy did not show signs of hepatic inflammation and demonstrated a 54% (Sanger) and 57% (Illumina) gene correction rate of the PI*ZZ variant at the DNA level, with no bystander edits or off-target effects. Following a transient elevation of transaminases during the early post-treatment period, liver enzymes normalized. Monthly serum AAT measurements demonstrated biologically active and stable therapeutic levels throughout follow-up. Conclusions: This case demonstrates efficient and precise hepatic gene correction without concerning histological alterations and with substantial improvement of functional parameters, supporting the feasibility and safety of gene editing approaches for AATD.

5
Sensor Geometry, Not Signal Processing, Limits Opportunistic Detection of Capillary-Refill-Like Signals by Rule-Based and Language-Model Methods in Archived ICU Waveforms

Landry, T. C.; Kim, Y.

2026-06-09 intensive care and critical care medicine 10.64898/2026.06.07.26355129 medRxiv
Top 0.2%
2.0%

Background. Capillary refill time is a resuscitation target in septic shock [1-4], but bedside measurement is examiner-dependent. An ICU monitor co-records a photoplethysmogram on the pulse oximeter and intermittent noninvasive blood pressure cuff cycles; if the probe and the cuff share a limb, each cycle is an unplanned vascular occlusion test on the distal microvascular bed. Standard practice places the two on opposite limbs. Objective. To measure how often, in MIMIC-IV-WDB v0.1.0, charted cuff cycles show the photoplethysmographic morphology expected of a same-limb cuff and probe, and to characterize the candidate capillary refill-like signal when that morphology is present. Methods. MIMIC-IV-WDB v0.1.0 [5] was linked to the MIMIC-IV clinical database [6]. A pre-registered rule-based detector identified candidate occlusion-reperfusion signatures on the 1-Hz perfusion-index envelope around each charted cuff timestamp. The primary endpoint was the proportion of cuff cycles suitable for analysis that were detector-positive at a 15-second reperfusion threshold, with 95% confidence intervals estimated by resampling patients at a fixed seed. A secondary analysis used a locally hosted multimodal language model (a Gemma-3 derivative on a non-device server) to adjudicate the same signature on perfusion-index plots; no MIMIC-IV-WDB content left the workstation. Results. Of 9,224 charted cuff cycles, 8,909 had a usable pulse-oximeter waveform, and 268 cycles in 15 patients (4.30% of the 6,236 cuff cycles suitable for analysis, 95% CI 2.60 to 6.03) met the primary 15-second threshold. The language model adjudicated the same cycles and called 1,367 of the 8,909 cycles with a usable waveform (15.34%) signature-present, roughly five times the detector's count. Because no laterality ground truth exists, agreement with a single blinded reader served as the comparator rather than accuracy.
The two methods were about equally concordant with the reader: precision was 0.25 (95% CI 0.14 to 0.39) for the detector and 0.24 (95% CI 0.10 to 0.35) for the language model, although reweighting to the full population of cycles with a usable waveform lowered the language model to 0.030 (95% CI 0.009 to 0.053). These estimates are reference-limited: a blinded re-read of a 150-card subsample showed only moderate intra-rater reliability (Cohen κ 0.46 to 0.59) with systematic undercalling on the first pass, and rescoring against the corrected re-read roughly doubled precision for both methods. Conclusions. Opportunistic extraction of capillary refill-like signals from archived ICU pulse oximetry is limited in two distinct ways. First, sensor geometry limits how often the signal is recordable: cuff cycles rarely show the morphology expected of a same-limb cuff and probe pair, consistent with opposite-limb placement, so the bottleneck is geometry rather than signal processing. Second, the modest reliability of morphology adjudication limits how well any single flagged cycle can be confirmed: against a blinded reader the detector is a usable screen but a noisy confirmer, the reference is itself only moderately reliable, and the language model is no more concordant despite flagging many more cycles. The minority of cycles in which the morphology appears contain a candidate signal that may merit prospective study under controlled placement with laterality recorded.
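The headline proportions follow directly from the counts in the abstract, and the confidence intervals are described as coming from resampling patients at a fixed seed. The sketch below reproduces the two percentages and shows one common way to resample at the patient level (a percentile cluster bootstrap). The bootstrap details, the function names, and the per-patient toy counts are assumptions for illustration; the abstract does not give per-patient data or the exact resampling scheme.

```python
import random
from typing import Sequence, Tuple


def percent(numerator: int, denominator: int) -> float:
    """Proportion expressed as a percentage."""
    return 100.0 * numerator / denominator


def patient_bootstrap_ci(per_patient: Sequence[Tuple[int, int]],
                         n_boot: int = 2000, seed: int = 0,
                         alpha: float = 0.05) -> Tuple[float, float]:
    """Percentile CI for a pooled proportion, resampling whole patients.

    per_patient holds one (positive_cycles, total_cycles) pair per patient.
    Resampling patients rather than cycles keeps each patient's cycles together.
    """
    rng = random.Random(seed)  # fixed seed makes the interval reproducible
    n = len(per_patient)
    stats = []
    for _ in range(n_boot):
        sample = [per_patient[rng.randrange(n)] for _ in range(n)]
        total = sum(t for _, t in sample)
        if total > 0:
            stats.append(sum(p for p, _ in sample) / total)
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lower, upper


if __name__ == "__main__":
    # Counts quoted in the abstract.
    print(f"{percent(268, 6236):.2f}%")   # prints "4.30%"
    print(f"{percent(1367, 8909):.2f}%")  # prints "15.34%"

    # Hypothetical per-patient counts, for illustration only (not study data).
    toy = [(0, 300), (12, 350), (40, 280), (3, 410), (0, 150)]
    print(patient_bootstrap_ci(toy))
```

Resampling patients instead of individual cycles typically widens the interval when a few patients contribute most of the positive cycles, which fits a setting where only 15 patients had any detector-positive cycle.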

6
A Machine Learning Pipeline for Scalable Annotation of Patient-Ventilator Dyssynchrony from Bedside Ventilator Data

Tlimat, A.; Mayampurath, A.; Safadi, S.; Kalehoff, J.; Seam, N.; Johnson, R. B.; Morris, P.; Bodduluri, S.; Bhatt, S. P.; Afshar, M.

2026-06-12 intensive care and critical care medicine 10.64898/2026.06.11.26355207 medRxiv
Top 0.2%
1.8%

Objective: Patient-ventilator dyssynchrony (PVD) is a common and clinically consequential problem in critically ill patients receiving invasive mechanical ventilation. Yet automated identification of PVD subtypes at scale remains an unmet clinical need, owing to the lack of large annotated bedside waveform datasets. Methods: We developed and validated a semi-supervised algorithm for automated annotation of PVD. In two medical ICUs at a tertiary academic center, bedside devices continuously collected airway flow and pressure waveforms from the ventilators. We developed a software interface with an information retrieval system that grouped similar breaths for expert human review, yielding 1,542,296 labeled breaths across eight categories: 2 labels for breath delivery mode, 5 labels for PVD subtypes, and 1 label denoting a normal breath. Two pulmonary physicians with expertise in ventilator training and education provided the expert reference labels. We trained an initial classification model on a model-derivation set of 771,148 breaths (divided into training and validation) and evaluated it on a hold-out test set of 771,149 breaths. A semi-supervised approach was used to extend labeling to an additional 12,965,000 unlabeled breaths. Results: The supervised model performed well across all labels, with macro-F1 scores between 0.96 and 1.00. Semi-supervised learning across 12 rounds expanded the training set from 771,148 to 8,563,995 breaths without significant performance degradation. Conclusion: We developed a practical and scalable system for automated PVD annotation that performed well across all subtypes. This work provides a reproducible foundation for automated PVD labeling to support the development of machine-learning-based clinical decision support systems for identifying patient-level asynchrony.

7
Identifying Clinical Diagnostic Trajectories Associated With Suicide Death Using Temporal Sequence Mining of Linked Claims and Mortality Data

Belouali, A.; Kitchen, C.; Haroz, E.; Lehmann, H.; Nestadt, P. S.; Wilcox, H. C.; Kharrazi, H.

2026-06-10 health informatics 10.64898/2026.06.08.26355231 medRxiv
Top 0.4%
1.0%

Background: Most approaches to suicide risk assessment consider clinical conditions as independent risk factors, potentially overlooking prognostic information in the order in which conditions accumulate. We applied temporal sequence mining to linked claims and mortality data to identify ordered clinical diagnostic trajectories associated with suicide death. Results: The cohort included 3 647 059 insured Maryland residents aged 10 years or older with available claims records in the Maryland Suicide Data Warehouse from January 1, 2016, to December 31, 2020, among whom 768 suicide deaths were ascertained through medical examiner linkage. Sequential pattern mining of ICD-10-CM diagnoses grouped into Clinical Classifications Software Refined categories identified 89 221 candidate sequences, of which 1 816 remained significantly associated with suicide death in time-varying Cox models. Adjusted hazard ratios (AHRs) ranged from 2.4 to 134.1. Two-thirds of significant trajectories ended in physical conditions, and approximately half crossed from psychiatric to physical endpoints. Among suicide decedents, 62% were exposed to at least 1 significant sequence (median, 16 per case); median sequence duration was 18.7 months, and median time from completion to death was 13.1 months. In landmark analyses, among patients with depression who later developed suicidal ideation (n = 26 356), the path through anxiety, then anemia, was associated with higher risk (AHR, 4.6; 95% CI, 2.2-9.5), whereas the anxiety-only path was not (AHR, 1.3; 95% CI, 0.8-2.1). Among patients with anxiety who later developed hypertension (n = 149 215), the path through history of self-harm was associated with higher risk (AHR, 32.0; 95% CI, 16.6-61.6). Associations were generally consistent across sex and age. Conclusions: Temporal ordering of clinical conditions may carry prognostic information for suicide death. 
Clinical trajectories incorporating physical illness within psychiatric sequences identified higher-risk groups. These findings suggest that opportunities for risk detection may extend beyond psychiatric settings and that suicide risk signals may be fragmented across care settings and not apparent within isolated encounters.

8
Optimisation of steatotic liver disease screening algorithm for resource-poor settings using machine learning

Mettananda, C.; Sivasumithran, K.; Ranaweera, L.; Madhubhashini, A.; Ranawaka, C.; Pathmeswaran, A.; Dassanayake, A.

2026-06-10 endocrinology 10.64898/2026.06.09.26355306 medRxiv
Top 0.5%
0.8%

Background The European Association for the Study of the Liver (EASL) Steatotic Liver Disease (SLD) screening algorithm involves two steps: initial screening with FIB-4, followed by referral for vibration-controlled transient elastography (VCTE) in patients likely to have significant fibrosis (SF). However, VCTE is not widely available in resource-limited settings. Aim To optimise the EASL SLD screening algorithm for resource-poor settings using machine learning (ML). Methods We analysed data from 964 adults aged ≥35 years who underwent VCTE at a tertiary referral centre in Sri Lanka between November 2024 and 2025. Multiple ML models using different methods and variable combinations were trained on 80% of the dataset and tested on the remaining 20%. The best models were selected based on performance and externally validated using data from 430 patients who underwent VCTE before November 2024. Model performance was compared with FIB-4 using confusion matrices. Results A Random Forest model incorporating age, AST, ALT, and platelet count separately, rather than combined as FIB-4, outperformed FIB-4. The all-variable ML model showed the best predictive performance for SF, with accuracy of 77.2%, recall of 0.762, precision of 0.778, and AUC-ROC of 0.818. The variables used in the model, in descending order of feature importance, were AST, platelet count, BMI, ALT, age, diabetes mellitus, hypertension, dyslipidaemia, sex, family history, hypothyroidism, diabetes complication and smoking. External validation demonstrated 75.1% accuracy and an AUC of 0.779. When used as the first step of the SLD screening algorithm, the all-variable ML model identified 37 (17.1%) additional true positives and reduced false-negative diagnoses by 50% compared with FIB-4. Conclusions ML-based models were more effective than the FIB-4 score as the first-line screening tool for VCTE referral, substantially improving the identification of patients with significant fibrosis in this South Asian cohort.
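For readers unfamiliar with the comparator, FIB-4 combines the same four inputs the Random Forest uses separately: FIB-4 = (age × AST) / (platelet count × √ALT), with age in years, AST and ALT in U/L, and platelets in 10^9/L. The sketch below computes the score and the confusion-matrix metrics the abstract reports. The function names and example values are hypothetical, and no referral cut-off is encoded because the abstract does not state the one used.

```python
import math


def fib4(age_years: float, ast_u_l: float, alt_u_l: float,
         platelets_1e9_l: float) -> float:
    """FIB-4 index: (age * AST) / (platelets * sqrt(ALT))."""
    if alt_u_l <= 0 or platelets_1e9_l <= 0:
        raise ValueError("ALT and platelet count must be positive")
    return (age_years * ast_u_l) / (platelets_1e9_l * math.sqrt(alt_u_l))


def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Accuracy, recall (sensitivity), and precision from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "recall": tp / (tp + fn),
        "precision": tp / (tp + fp),
    }


if __name__ == "__main__":
    # Hypothetical patient values, for illustration only.
    score = fib4(age_years=50, ast_u_l=40, alt_u_l=36, platelets_1e9_l=200)
    print(f"FIB-4 = {score:.2f}")  # prints "FIB-4 = 1.67"

    # Hypothetical counts, for illustration only (not the study's matrix).
    print(confusion_metrics(tp=80, fp=20, tn=70, fn=30))
```

Because FIB-4 fixes how the four inputs are combined, a model that receives them as separate features is free to learn a different weighting, which is one plausible reason the separate-variable model could perform better in this cohort.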

9
Neonatal Brain Network Integration Trajectories Predict Neurodevelopment in Congenital Heart

Harasymiw, L.; Kuang, A.; Xu, D.; Scheffler, A.; George, E.; Peyvandi, S.; McQuillen, P.

2026-06-08 pediatrics 10.64898/2026.06.06.26355074 medRxiv
Top 0.6%
0.7%

Background: Infants with critical congenital heart disease (CHD) are at high risk for abnormal brain development and later neurodevelopmental impairment. We hypothesized that the trajectory of perioperative whole-brain network development would predict neurodevelopmental outcomes in early childhood. Methods: This prospective longitudinal cohort of neonates with critical CHD (n = 97) underwent preoperative and/or postoperative brain MRI with diffusion imaging. Whole-brain network measures were derived from structural connectomes. Neurodevelopment was assessed between 1 and 4 years using the Bayley Scales of Infant and Toddler Development. Results: White matter injury was associated with slower perioperative growth in global efficiency (p = 0.013), a measure of network integration, whereas cardiac physiology was not associated with network development. Infants with greater perioperative increases in global efficiency had higher cognitive (p = 0.001), language (p < 0.001), and motor (p = 0.008) scores. For each 1-standard deviation increase in the trajectory of global efficiency, cognitive scores increased by 8.2 points (95% CI, 3.64-12.78), independent of brain injury and socioeconomic factors. Conclusion: In infants with critical CHD, longitudinal whole-brain network development was associated with neurodevelopment across multiple domains. Early network development may represent a candidate biomarker of neurodevelopmental risk and resilience in this population.

10
From Charting Burden to Workflow Signal: Retrospective Validation of Documentation-Density Measures for ICU Complexity and Long-Stay Risk

Collier, A.

2026-06-06 health informatics 10.64898/2026.06.04.26354922 medRxiv
Top 0.7%
0.7%

Background Electronic health record documentation patterns may reflect workflow complexity, monitoring intensity, and operational strain in intensive care settings. However, documentation-derived features can be sensitive to local documentation culture, data capture systems, and outcome definitions. Retrospective validation across multiple datasets is therefore needed before these signals are used in workflow intelligence or clinical AI governance tools. Objective To evaluate whether documentation-density and documentation-timing features show reproducible retrospective signal for ICU workflow complexity and long-stay proxy outcomes across de-identified critical care datasets, while distinguishing workflow and long-stay associations from unsupported claims about mortality prediction, burden reduction, or deployment readiness. Methods We synthesized retrospective validation results from de-identified ICU and workflow datasets generated through a prespecified documentation-density validation program. Feature families included Documentation Burden Score style features, Shift-End Documentation Rate style features, documentation reliability style metadata, and all-documentation feature sets where available. Outcomes included long ICU length of stay proxies, mortality where available, and workflow proxy endpoints. Models compared baseline feature sets with enhanced models containing documentation-density or workflow features. Performance was summarized using area under the receiver operating characteristic curve, Brier score where reported, delta AUROC, bootstrap confidence intervals where reported, and label-shuffle controls where available. Results The strongest external long-stay proxy evidence came from the NWICU chartevents analysis, which included 28,612 ICU stays, 20,267 stays with chart events, and 9,619,759 chart events. For ICU length of stay greater than the median, baseline AUROC was 0.5252. 
Enhanced AUROC was 0.9512 for Documentation Burden Score features, 0.9214 for Shift-End Documentation Rate features, 0.8470 for documentation reliability style features, and 0.9517 for all documentation features. Corresponding label-shuffle enhanced AUROCs were near random, ranging from 0.4897 to 0.5064. For ICU length of stay greater than the 75th percentile, baseline AUROC was 0.5155. Enhanced AUROC was 0.9433 for Documentation Burden Score features, 0.9194 for Shift-End Documentation Rate features, 0.8118 for documentation reliability style features, and 0.9427 for all documentation features, with label-shuffle enhanced AUROCs from 0.4836 to 0.4999. Additional retrospective support was observed in eICU workflow analyses, HiRID first-24-hour documentation-density analyses, MIMIC-IV HF ICU internal analyses, MIMIC-IV-Note metadata extensions, and nursing-chart or lab density proxy analyses. However, cross-institution discrimination transfer was weak without recalibration, and several analyses remained proxy validations rather than final clinical validations. Conclusions Documentation-density and documentation-timing features show promising retrospective signal for ICU workflow complexity and long-stay proxy outcomes, especially in NWICU chartevents and selected internal dataset-specific analyses. These findings support further preregistered, prospective, silent-mode validation of documentation-derived workflow intelligence. They do not establish prospective clinical performance, mortality reduction, clinician burden reduction, autonomous deterioration prediction, or deployment readiness.

11
Development and Prospective Validation of Predictive Model for Early Hemodynamic Deterioration in Critical Care: A Multicenter Study

Nagori, A.; Singh, P.; Firdos, S.; Devadiga, A.; Vats, V.; Gupta, A.; Bandhey, H.; Ailavadi, P.; Awasthi, R.; Narotam, N.; Mishra, A.; Lodha, R.; Sethi, T.

2026-06-10 intensive care and critical care medicine 10.64898/2026.06.05.26353765 medRxiv
Top 0.8%
0.5%

High-frequency physiological monitoring in ICUs can identify impending deterioration hours before clinical recognition, yet extracting reliable early-warning signals from noisy vital-sign streams remains challenging. We present SIgnose, an interpretable prediction framework for early detection of abnormal shock index (SI), built from routinely monitored vital signs using physiologic variability and nonlinear time-series features. SIgnose was developed on the eICU Collaborative Research Database and externally validated on the MIMIC-III adult database and a pediatric SafeICU cohort (AIIMS New Delhi), with additional prospective validation in the pediatric ICU. We benchmarked three representation strategies: (i) engineered physiologic variability and nonlinear time-series features, (ii) deep learning, and (iii) Llama-3.1-8B embeddings with low-rank adaptation. Physiologic variability features consistently demonstrated superior cross-cohort generalization. The final model used 3,970 features from five vital signs to predict abnormal SI up to 8 hours ahead, achieving AUROC 0.861 (95% CI 0.859-0.863) and AUPRC 0.927 (95% CI 0.925-0.929) on eICU. External validation yielded AUROC 0.870 (95% CI 0.863-0.876) and AUPRC 0.935 (95% CI 0.930-0.940) on MIMIC-III, and AUROC 0.875 (95% CI 0.863-0.888) and AUPRC 0.915 (95% CI 0.898-0.930) on SafeICU; prospective pediatric validation (n = 88) achieved AUROC 0.885 (95% CI 0.868-0.902) and AUPRC 0.911 (95% CI 0.882-0.936). SHAP interpretability analysis identified heart rate variability, respiratory trend dynamics, and multi-scale blood pressure variability as key early-warning signatures. These findings establish SIgnose as a reproducible, low-compute, early-warning framework and demonstrate that physiologic variability features provide robust, generalizable representations for early deterioration detection across adult and pediatric critical care.
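The prediction target, shock index, is conventionally defined as heart rate divided by systolic blood pressure. The sketch below computes it and flags values at or above a caller-supplied threshold. The threshold is deliberately left as a parameter because the abstract does not state the cut-off SIgnose uses (and pediatric cut-offs are usually age-dependent); the function names and example readings are hypothetical.

```python
def shock_index(heart_rate_bpm: float, systolic_bp_mmhg: float) -> float:
    """Shock index: heart rate (beats/min) divided by systolic BP (mmHg)."""
    if systolic_bp_mmhg <= 0:
        raise ValueError("systolic_bp_mmhg must be positive")
    return heart_rate_bpm / systolic_bp_mmhg


def is_abnormal_si(si: float, threshold: float) -> bool:
    """True when the shock index meets or exceeds the chosen threshold.

    The threshold is an assumption supplied by the caller; the preprint
    abstract does not specify one.
    """
    return si >= threshold


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    si = shock_index(heart_rate_bpm=120, systolic_bp_mmhg=100)
    print(f"SI = {si:.2f}")  # prints "SI = 1.20"
    print(is_abnormal_si(si, threshold=0.9))  # prints "True"
```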

12
Validity and Limitations of the Empatica E4 Wristband for Autonomic and Thermoregulatory Sleep Monitoring Against Concurrent Polysomnography: A Wearanize+ Dataset Study

Parry, Y. D.; Briganti, G.

2026-06-11 health informatics 10.64898/2026.06.10.26355348 medRxiv
Top 0.9%
0.4%

The Empatica E4 wristband provides continuous multi-modal physiological monitoring, including blood volume pulse (BVP), electrodermal activity (EDA), and skin temperature (TEMP), but its validity for sleep-stage-specific autonomic and thermoregulatory monitoring has not been systematically evaluated against concurrent polysomnography (PSG). Using the Wearanize+ dataset, which provides synchronised PSG, Empatica E4, and Zmax EEG recordings from 100 home-recorded participants, a systematic validation of Empatica E4 physiological signals against PSG ground truth across five sleep stages was conducted. Of 100 participants, 92 had Empatica data; 69 met Zmax EEG signal quality criteria and formed the analysis sample. Heart rate (HR) from the pre-computed Empatica HR channel showed valid stage-specific patterns (Wake: 70.9 bpm, N3: 61.2 bpm) and moderate inter-device MeanNN correspondence with PSG ECG (Spearman r = 0.35-0.42 across stages). Skin temperature showed the expected thermoregulatory pattern (Wake: 33.92 °C, N3: 35.48 °C) and is recommended for downstream analyses. Tonic EDA showed an inverted stage pattern attributable to wrist sweat accumulation during deep sleep, representing a known confound for wrist-worn EDA during sleep. Phasic EDA showed plausible patterns and may be used with caution. These findings establish a validated feature set for Empatica E4 sleep research and directly inform multimodal psychiatric biomarker studies using the Wearanize+ dataset.

13
Efficacy and Safety of Traditional Chinese Medicine in Obesity Management: A Systematic Review and Meta-Analysis

Zhang, Y.; Wang, Y.

2026-06-08 endocrinology 10.64898/2026.06.04.26354905 medRxiv
Top 1%
0.3%

Background: Obesity is a global health crisis, contributing to chronic diseases such as diabetes, cardiovascular disease, and metabolic syndrome. Traditional Chinese Medicine (TCM) has been used in East Asia to manage obesity, but evidence on its efficacy and safety remains limited. This systematic review and meta-analysis assess clinical evidence from randomized controlled trials (RCTs) on TCM for obesity treatment. Methods: We systematically searched PubMed, EMBASE, Cochrane Library, and Web of Science from inception to April 2026. Eligible RCTs compared TCM interventions with placebo or conventional treatments in obese patients. Two reviewers independently conducted screening, data extraction, and quality assessment. Meta-analysis was conducted using a random-effects model to calculate pooled weighted mean differences (WMD) and odds ratios (OR) for body weight, BMI, waist-to-hip ratio (WHR), lipid profiles, and adverse events. Results: A total of 33 randomized controlled trials (RCTs) involving 3,053 participants were included in the analysis. TCM significantly reduced body weight (WMD = -5.86 kg, 95% CI: -7.51 to -4.21), BMI (WMD = -2.82 kg/m², 95% CI: -3.38 to -2.25), and WHR (WMD = -0.04, 95% CI: -0.06 to -0.02). Lipid profiles improved, with reductions in total cholesterol (WMD = -0.82 mmol/L), triglycerides (WMD = -0.65 mmol/L), LDL-C (WMD = -0.39 mmol/L), and increased HDL-C (WMD = 0.29 mmol/L) (all p < 0.001). Adverse events were infrequent, with no significant difference observed between TCM and control groups (OR = 0.51, 95% CI: 0.24 to 1.08). Funnel plots indicated no publication bias. Conclusion: TCM appears effective in reducing body weight and improving lipid profiles in obese patients, with a low incidence of adverse events. It may serve as a complementary treatment for obesity, though further high-quality RCTs are needed to confirm these findings and assess long-term outcomes.
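
The abstract above reports pooled weighted mean differences from a random-effects model but does not name the estimator. As a minimal sketch, assuming the widely used DerSimonian-Laird method, the pooling step could look like the following. The function name and the per-study numbers in the usage lines are hypothetical and are not taken from the review.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool per-study effects with the DerSimonian-Laird random-effects estimator.

    effects   -- per-study mean differences (hypothetical inputs)
    variances -- per-study sampling variances of those differences
    Returns (pooled effect, 95% CI as a tuple, between-study variance tau^2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                    # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to every study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_ws = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_ws
    se = math.sqrt(1.0 / sum_ws)
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2


# Hypothetical body-weight differences (kg) and variances from three trials.
pooled, ci, tau2 = dersimonian_laird([-6.1, -4.8, -7.0], [0.9, 1.2, 1.5])
print(f"pooled WMD = {pooled:.2f} kg, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, tau^2 = {tau2:.2f}")
```

When the studies agree closely, tau^2 is truncated at zero and the result coincides with the fixed-effect estimate; larger heterogeneity widens the confidence interval.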

14
A Comparison of Manual and Automated Approaches to Developing Computable Algorithms for Identifying Acute Pancreatitis

Bann, M. A.; Carrell, D. S.; Gruber, S.; Heagerty, P. J.; Williamson, B. D.; Nelson, J. C.; Hazlehurst, B.; Felcher, A.; Nyongesa, D. B.; Slaughter, M. T.; Sapp, D. S.; Cronkite, D. J.; Ball, R.; Floyd, J. S.

2026-06-08 health informatics 10.64898/2026.06.05.26354934 medRxiv
Top 1%
0.3%

Objective: Clinical phenotyping methods that rely on clinical and informatics expertise can be time-intensive and costly. We tested both manual and highly automated approaches using electronic health record (EHR) data to identify an FDA Sentinel Initiative health outcome of interest, acute pancreatitis. Materials and Methods: We trained and evaluated machine learning algorithms using EHR data with two approaches: a custom approach that included manually curated features and trained on outcomes data validated with medical record review, and a highly automated approach that greatly simplifies and automates feature engineering and relies on low-cost silver-standard outcomes for model training. Results: Custom algorithms using manually curated structured claims data discriminated cases from non-cases with a high degree of accuracy (cv-AUC 0.89 [95% CI 0.84-0.94]); the inclusion of natural language processing (NLP)-derived covariates from clinical notes increased performance slightly (cv-AUC 0.91 [95% CI 0.86-0.97]). The automated algorithm trained on the outcome count of diagnosis codes performed less well (AUC 0.80 [95% CI 0.75-0.85]) but improved using maximum lipase value as an outcome (AUC 0.88 [95% CI 0.84-0.92]). At a positive predictive value of 90%, the custom algorithm had a sensitivity of 92%, the automated algorithm trained on diagnosis code count had a sensitivity of 45%, and the automated algorithm trained on maximum lipase value had a sensitivity of 84%. However, a prediction rule derived by clinicians during chart review was nearly as accurate (maximum lipase value ≥ 3 times upper limit of normal; AUC 0.86, PPV 85%, sensitivity 92%). Discussion: Machine learning algorithms with manually curated structured data and NLP features trained on validated outcomes data successfully identified validated events.
Use of an outcome in the automated model based on specific phenotype knowledge (maximum lipase value) allowed for performance similar to the custom model with considerably fewer resources.

15
Metatranscriptomics-Derived Disease Risk Scores as a Preventive, Diagnostic, and Treatment Support Tool

Hu, L.; Bass, M.; Patridge, E.; Molusky, M.; Antoine, G.; Vuyisich, M.; Banavar, G.

2026-06-06 genetic and genomic medicine 10.64898/2026.05.29.26354333 medRxiv
Top 1%
0.2%

Background: Chronic diseases and symptom syndromes often develop after prolonged biological changes that may precede formal diagnosis. RNA-based metatranscriptomics captures active microbial and human gene expression and may provide a functional layer for disease risk evaluation. To address this translational gap, we developed and validated a Disease Risk Score (DRS) framework that integrates metatranscriptome-derived pathway activity scores from stool, saliva, and blood samples, and evaluated its potential clinical utility as an adjunct risk-evaluation tool. Methods: DRS uses disease-specific sets of pathway activity scores derived from stool and saliva microbial functions, stool and saliva microbial taxa, and blood human gene expression. For each disease, 'not optimal' pathway scores are aggregated into a normalized cumulative odds ratio, or cOR, using score-level odds ratios, statistical significance, and literature-supported biological relevance derived from a Development Cohort of 22,369 individuals. A cOR ≥ 5 is defined as high risk. Performance is evaluated in an independent Validation Cohort of 15,908 individuals using self-reported diseases as the reference. Disease support requires both significant cOR separation between self-reported and not-reported (Cohen's d ≥ 0.2) and risk ratio enrichment of self-reported disease among individuals classified as high risk (95% CI of Risk Ratio > 1). Results: Of 20 initially evaluated diseases, 15 meet the prespecified validation criteria on the independent validation cohort: ADHD, anxiety, chronic fatigue syndrome, depression, GERD, hypertension, inflammatory bowel disease, IBS-C, IBS-D, insomnia, MASLD, obesity, obstructive sleep apnea, Sjogren's syndrome, and type 2 diabetes.
Five selected clinical scenarios illustrate how DRS can support clinician-mediated decision making, including IBS subtype reclassification, improved diagnostic acceptance in IBS-D, personalized lifestyle counseling in MASLD and early type 2 diabetes, and diagnostic uncertainty in atypical GERD. Conclusions: DRS is a metatranscriptomics-based risk-stratification framework that aggregates active microbial and human pathway signals into interpretable disease-specific risk estimates across a wide range of disease conditions. Validation against self-reported disease labels in an independent cohort shows significant risk enrichment for each of 15 diseases. DRS is intended as an adjunct to clinical evaluation: a decision support tool in situations where routine care encounters uncertainty, delay, or low patient engagement. Future prospective studies using clinically adjudicated endpoints are needed to assess calibration and clinical outcomes.
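
The two validation criteria stated in the Methods of this abstract (Cohen's d of at least 0.2 between groups, and a risk ratio whose 95% confidence interval lies above 1) can be illustrated with a short sketch. It assumes the pooled-standard-deviation form of Cohen's d and the standard log-scale (Katz) interval for the risk ratio; the abstract does not say which variants the authors used, and every function name and number below is hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def risk_ratio_ci(cases_high, n_high, cases_other, n_other, z=1.96):
    """Risk ratio of disease in the high-risk group versus the rest, with a log-scale CI."""
    rr = (cases_high / n_high) / (cases_other / n_other)
    se_log = math.sqrt(1 / cases_high - 1 / n_high + 1 / cases_other - 1 / n_other)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


def disease_supported(d, rr_lower, d_min=0.2):
    """Both criteria must hold: d >= 0.2 and the whole risk-ratio CI above 1."""
    return d >= d_min and rr_lower > 1.0


# Hypothetical cOR values for self-reported and not-reported individuals.
d = cohens_d([4.0, 6.5, 5.2, 7.1], [3.1, 4.0, 2.8, 5.0])
# Hypothetical counts: 30 of 100 high-risk and 10 of 100 other individuals report disease.
rr, lower, upper = risk_ratio_ci(30, 100, 10, 100)
print(disease_supported(d, lower))
```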

16
Genetic Susceptibility to Incisional Hernia: Evaluation of Hernia Polygenic Risk Scores

Pregnall, A. M.; Hornick, M. M.; Broach, R. B.; Judy, R.; DePaolo, J.; Yuan, S.; Levin, M.; Fischer, J. P.; Damrauer, S. M.; Wachtel, H.

2026-06-11 genetic and genomic medicine 10.64898/2026.06.10.26355374 medRxiv
Top 1%
0.2%

Objectives: Incisional hernia (IH) affects 13-30% of people after abdominal surgery, resulting in substantial morbidity and costs. While clinical risk factors have been studied extensively, genomic risk for IH is incompletely understood. We aimed to evaluate the impact of polygenic risk scores (PRS) on IH risk prediction. Methods: We created and evaluated three PRS for abdominal hernia, ventral hernia, and latent hernia susceptibility for prediction of IH in an institutional biobank. The primary outcome was defined as the diagnosis or repair of an IH based on ICD-9/10-CM/PCS and CPT codes. Clinical covariates included age, sex, body mass index (BMI), smoking status, index procedure type, and perioperative surgical site infection. A phenome-wide association study (PheWAS) was performed to assess clinical associations with increased PRS. We then tested the ability of the PRS to improve prediction for IH by modeling clinical covariates with and without PRS in patients who underwent abdominal surgery. Model performance was assessed using 10 iterations of 5-fold cross-validation to estimate Brier scores and area under the receiver operating characteristic curve (AUROC), which were compared using cross-model Bayesian analysis of variance. Results: In 55,809 subjects, assessed PRS was significantly associated with incisional, umbilical, and ventral hernia on PheWAS, with 1.19-fold greater odds of developing IH per 1-SD increase in PRS (95% CI: 1.13-1.25, P < 0.001). Of 9,909 subjects who underwent qualifying abdominal surgery, 706 developed IH. In this cohort, the latent hernia susceptibility PRS was associated with a 16% increased hazard of developing IH per 1-SD increase (HR 1.16; 95% CI: 1.07-1.26; P < 0.001).
Compared to a predictive model using clinical covariates (Brier score = 0.047, 95% CI: 0.046-0.048; AUROC = 0.660, 95% CI: 0.653-0.666), addition of the PRS showed similar Brier score and AUROC estimates (Brier score = 0.047, 95% CI: 0.046-0.048; AUROC = 0.667, 95% CI: 0.661-0.673) at five years. Cross-model Bayesian analysis demonstrated >99% probability of practical equivalence when trying to detect a difference of ≥ 0.02. Conclusion: All three PRS for hernia were independently associated with IH, suggesting that genomic factors contribute significantly to IH development. However, none of the three PRS meaningfully improved clinical IH risk prediction in patients who underwent abdominal surgery. This suggests that clinical comorbidities and surgical techniques may be as important as genomic architecture.
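
The two performance metrics compared in this abstract, the Brier score and AUROC, have simple definitions for a binary outcome. The sketch below shows those definitions only; it assumes a plain binary outcome with predicted probabilities and ignores the censoring that a five-year time-to-event evaluation would have to handle. The function names and the toy data are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_prob):
    """Probability that a random case is scored above a random non-case (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case and one non-case")
    wins = 0.0
    for p_case in pos:
        for p_control in neg:
            if p_case > p_control:
                wins += 1.0
            elif p_case == p_control:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: two non-cases followed by two cases, with hypothetical predicted risks.
outcomes = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]
print(brier_score(outcomes, risks), auroc(outcomes, risks))
```

A lower Brier score indicates better overall accuracy of the predicted probabilities, while AUROC reflects ranking only, which is why two models can match on one metric and differ on the other.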

17
Beyond event-rate enrichment: proteomic risk scores for mechanism-aware prevention trial design

Fieggen, J.; Simond, G.; Segal, B. M.; Noori, A.; Thakurta, A.; Butler, C. C.; Clifton, D. A.; Clifton, L.

2026-06-10 health informatics 10.64898/2026.06.09.26355266 medRxiv
Top 2%
0.2%

Background. Blood-based biomarkers are increasingly proposed for identifying high-risk individuals before clinical disease and for making prevention-oriented trials more efficient. Prognostic enrichment can increase event rates, but trial efficiency also depends on whether the intervention effect is preserved in the enriched population. Methods. Using the UK Biobank Pharma Proteomics Project, we trained disease-specific proteomic risk scores (ProRS) from 2,916 plasma proteins with elastic-net Cox models. We compared ProRS, polygenic risk scores (PRS), and combined PRS-ProRS scores across ten incident diseases. We estimated cumulative incidence and theoretical two-arm time-to-event trial sample sizes across risk strata. To evaluate effect preservation, we examined six intervention-analogue exposure-outcome pairs spanning genetic (PCSK9/coronary artery disease, APOE/Alzheimer's disease, PPARG/type 2 diabetes, IL23R/Crohn's disease), behavioural (physical activity/all-cause mortality), and pharmacological (RAAS inhibitors versus calcium channel blockers/coronary artery disease) examples. Results. ProRS outperformed PRS for 9 of 10 diseases (median C-index 0.75 versus 0.61). ProRS and PRS were weakly correlated (median Pearson |r| = 0.04), and joint PRS-ProRS stratification identified groups with higher observed incidence than either score alone for several endpoints. In the top risk quartile, combined-score enrichment reduced theoretical required sample sizes by 32-74% under a fixed 20% relative hazard reduction. These gains were not always preserved when stratum-specific intervention-analogue effects were used. Effects were broadly preserved for APOE/Alzheimer's disease and physical activity/mortality. The PPARG/type 2 diabetes effect attenuated toward the null under all three score types, showing that event-rate enrichment does not guarantee effect preservation.
For IL23R/Crohn's disease and the antihypertensive comparison, point estimates differed across score types (preserved under polygenic but attenuated under proteomic enrichment), but confidence intervals were wide and overlapping. Conclusions. Proteomic risk scores can identify high-event-rate populations for prevention-oriented trials, but event-rate enrichment alone is insufficient for trial design. Biomarker-guided enrichment should evaluate mechanism-specific effect preservation and may be preferable as a stratification or adaptive-design variable rather than as a restrictive eligibility criterion.
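
The link between event-rate enrichment and sample size described in this abstract can be made concrete with a back-of-the-envelope calculation. The sketch below assumes Schoenfeld's approximation for a two-arm trial with 1:1 allocation, which the abstract does not confirm as the authors' method; the function names and the 5% and 12% event probabilities are hypothetical. Only the 20% relative hazard reduction (hazard ratio 0.8) comes from the abstract.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.80):
    """Schoenfeld's approximation: events needed for a two-sided log-rank test, 1:1 allocation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2


def required_sample_size(hazard_ratio, event_probability, alpha=0.05, power=0.80):
    """Total participants, given the overall probability of an event during follow-up."""
    return math.ceil(required_events(hazard_ratio, alpha, power) / event_probability)


# Hypothetical event probabilities: unselected population versus top risk quartile.
n_unselected = required_sample_size(0.8, event_probability=0.05)
n_enriched = required_sample_size(0.8, event_probability=0.12)
print(f"relative reduction in sample size: {1 - n_enriched / n_unselected:.0%}")
```

The number of required events depends only on the hazard ratio, so enrichment shrinks the trial solely through the event probability. If the hazard ratio moves toward 1 in the enriched stratum, as the abstract reports for PPARG/type 2 diabetes, the required events grow and can cancel that gain.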

18
Polygenic risk scores associate with asthma phenotypes and proteomic analyses implicate IL1R1 in two family-based studies

Lee, S.; Moll, M.; Mendez, K.; Prince, N.; Lasky-Su, J.; Lutz, S. M.; Weiss, S. T.; Lange, C.; Kelly, R. S.; Hecker, J.

2026-06-11 genetic and genomic medicine 10.64898/2026.06.06.26355045 medRxiv
Top 2%
0.2%

Despite its high prevalence and the discovery of hundreds of genetic associations, the genetic determinants and heterogeneous manifestations of asthma remain incompletely understood. Incorporating polygenic risk scores (PRS) into asthma research offers a powerful approach to quantify inherited susceptibility, refine risk profiles, and advance mechanistic understanding of disease development. For this study, we leveraged whole-genome sequencing (WGS) data from two family-based cohorts of childhood asthma, the Genetics of Asthma in Costa Rica Study (GACRS) and the Childhood Asthma Management Program (CAMP), to examine the transmission profiles of externally derived asthma PRS and their associations with clinical phenotypes in children with asthma. To further elucidate molecular mechanisms, we integrated large-scale external genome-wide association study (GWAS) summary statistics and genetic prediction models of protein abundance in a two-step proteome-wide association study (PWAS) of asthma. Our findings provide robust evidence supporting the validity of externally derived asthma PRS (association p = 10⁻²⁴ for the Global Biobank Meta-analysis Initiative [GBMI] PRS in GACRS and CAMP trios combined) and reveal consistent associations with spirometry measures and atopy markers across both studies, as 13 of 21 traits (62%) were significantly associated with the GBMI-PRS in the meta-analysis after multiple-testing correction. Moreover, the results of the integrative proteomic analysis implicate IL-1 signaling in the etiology of asthma, reinforcing the candidacy of IL1R1 antagonists for drug repurposing.

19
Registered Report: Artifact Index for Capacitive Electrocardiography Acquired with an Armchair

Warnecke, J. M.; Baumgärtel, D.; Bollmann, J.; Deserno, T. M.

2026-06-09 health informatics 10.64898/2026.06.03.26353526 medRxiv
Top 2%
0.2%

Background: Continuous health monitoring enables early detection of diseases and improves therapeutic outcomes. Non-intrusive biosignal sensors, such as capacitive ECG (cECG), offer a practical solution for daily monitoring in private environments, such as smart homes and vehicles. However, artifacts reduce signal quality and compromise reliability. Methods: Following a registered report protocol (Warnecke JM et al. PLoS One. 2021; 16(7):e0254780), we record data from 44 subjects and develop an artifact index for cECG. We use three signal quality indices (SQIs): the correlation of QRS complexes (corSQI), the R-peak detection consistency (bSQI), and the absolute amplitude ratio (aSQI). Our index classifies overlapping 10 s segments with a step width of 2 s as clean or artifact segments. We label a 2 s interval as an artifact if all five overlapping segments indicate artifacts. We record cECGs using an armchair with integrated electrodes in a single-arm study involving 44 subjects performing two activities, reading and watching television (TV), for 11 minutes each. We record a time-synchronized reference ECG with skin electrodes on the chest. To evaluate the artifact index, we compare it with manually generated ground truth. Moreover, we evaluate the clothing materials cotton, linen, jeans, and polyester in 5 subjects. Results: Watching TV results in longer, continuously clean signal durations than reading. On average, 88.3% of the signal recorded while watching TV has a minimum continuous clean duration of 10 s, versus 79.8% during reading. All clothing configurations achieve a clean signal duration exceeding 10 s. Among the SQI metrics, bSQI performs best, achieving an accuracy of 90.7% and an F1 score of 79.9%. Combining the three SQI metrics in a voting approach improves accuracy to 92.0% and F1 score to 82.1%.
Discussion: Our artifact index automatically distinguishes clean from artifact cECG segments, promoting health monitoring in unsupervised real-world settings, earlier disease detection, and preventive health management. A limitation is the investigation of only two scenarios (reading and watching TV).
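
The segment-to-interval rule in the Methods of this abstract (10 s segments stepped by 2 s, with a 2 s interval labelled as artifact only when all five segments covering it are flagged) can be sketched as below. Two points are assumptions rather than details from the abstract: the voting rule is taken to be a simple majority of the three SQIs, and at the start and end of a recording, where fewer than five segments cover an interval, all available segments must be flagged. The function names are hypothetical.

```python
def majority_vote(cor_flag, b_flag, a_flag):
    """Assumed voting rule: a segment is an artifact if at least two of three SQIs say so."""
    return (int(cor_flag) + int(b_flag) + int(a_flag)) >= 2


def label_intervals(segment_is_artifact, n_overlap=5):
    """Map per-segment flags to per-interval labels.

    segment_is_artifact[k] is the flag of the 10 s segment starting at 2*k seconds.
    Interval j spans [2*j, 2*j + 2) seconds and is covered by segments j-4 .. j.
    """
    n_segments = len(segment_is_artifact)
    if n_segments == 0:
        return []
    n_intervals = n_segments + n_overlap - 1
    labels = []
    for j in range(n_intervals):
        first = max(0, j - (n_overlap - 1))
        last = min(n_segments - 1, j)
        covering = segment_is_artifact[first:last + 1]
        # Artifact only if every segment that covers this interval is flagged.
        labels.append(all(covering))
    return labels


# Hypothetical per-segment SQI decisions (corSQI, bSQI, aSQI) for seven segments.
sqi_flags = [(0, 0, 1), (1, 1, 0), (1, 1, 1), (0, 1, 1), (1, 0, 1), (1, 1, 1), (0, 0, 0)]
segments = [majority_vote(*flags) for flags in sqi_flags]
print(label_intervals(segments))
```

Requiring agreement from every covering segment makes the interval label conservative: one clean segment is enough to keep an interval, which favours longer continuous clean stretches at the cost of missing short artifacts.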

20
Foundation model-based tool for automated ulcerative colitis histology scoring demonstrates non-inferiority to pathologists across multiple scoring indices

Tahir, W.; Shamshoian, J.; Tauber, J.; Clinton, L. K.; Griffin, M.; Shah, C.; Singh, G.; Fahy, D.; Sucipto, K.; Brosnan-Cashman, J.; Altepeter, T. A.; Bhattacharya, S.; Crandall, W.; Duan, C.; Gale, J. D.; Gupta, V.; Haarmann, H.; Harpaz, N.; Hooper, A. T.; Horowitz, J.; Hurtado-Lorenzo, A.; Hussaini, B. E.; Jairath, V.; Jones, A.; Kostiuk, B.; Kukreja, A.; Laroux, F. S.; Lissoos, T.; McBride, R. B.; Najdawi, F.; Nayyar, A.; Osterman, M. T.; Panchal, P.; Ruane, D.; Travis, S.; Visvanathan, S.; Wilson, L.; Jayson, C.

2026-06-11 pathology 10.64898/2026.06.09.26355212 medRxiv
Top 2%
0.2%

In clinical trials for ulcerative colitis (UC), pathologists assess disease severity through standardized histological indices, including the Geboes Score, Robarts Histopathology Index (RHI), and Nancy Histologic Index (NHI). Despite strong associations with clinical outcomes, histologic scoring suffers from inter- and intra-reader variability, and consensus criteria for histologic remission remain uncertain. Through a consortium approach, we developed an artificial intelligence-based measurement (AIM) tool for scoring histology in UC mucosal biopsies (AIM-HI UC). This model, trained on a large dataset of UC biopsies (N=10,230), utilizes additive multiple instance learning models leveraging PLUTO, a pathology foundation model, that predict each of the Geboes subgrades, from which the Geboes grade-level score, RHI, and NHI can be calculated. Evaluation of this model on a standalone verification set including clinical trial specimens established algorithm non-inferiority and/or superiority relative to standard qualified pathologists through comparison of algorithm-consensus and pathologist-consensus agreement metrics (non-inferior if difference >-0.1, superior if difference >0, inclusive of confidence intervals). AIM-HI UC was determined to be non-inferior to pathologists (N=3) for the prediction of all seven Geboes subgrades, grade-level Geboes, RHI, NHI, histologic improvement (GS<3.1), 2A histologic remission (GS<2A.0), and 2B histologic remission (GS<2B.0). AIM-HI UC was superior to pathologists for several Geboes subgrades (GS 0, GS 1, GS 2B, and GS 5), as well as grade-level Geboes, RHI, and positive percent agreement of 2A histologic remission. The model was shown to be greater than 99% repeatable for all histologic scoring metrics examined. 
Model-derived scores were shown to strongly correlate with canonical histologic features of inflammation, including the proportion of total epithelium that is inflamed (Spearman r=0.83; p<0.01), the proportion of neutrophils localized within crypt epithelium (Spearman r=0.83, p<0.01), and the amount of mucosal area classified as erosion or ulceration (Spearman r=0.80, p<0.01). Overall, these results suggest that AIM-HI UC has the potential to improve consistency of UC histology interpretation, providing a path toward standardization of UC histology scoring in clinical trials.